Abstract

Let $A$ be an $a$-letter alphabet. We consider fractional powers of
$A$-strings: if $x$ is an $n$-letter string, $x^r$ is the prefix of
$xxxx\ldots$ having length $nr$.

Let $l$ be a positive integer. Ilie, Ochem and Shallit defined $R(a,l)$ as
the infimum of reals $r > 1$ such that there exists a sequence of $A$-letters
without factors (substrings) that are fractional powers $x^{r'}$ where $x$ has
length at least $l$ and $r' \ge r$.

We prove that $1 + \frac{1}{la} \le R(a,l) \le 1 + \frac{c}{la}$ for some constant $c$.

1 Introduction 

A fractional power $x^r$ of a string $x$ is defined as $x^r = xx\ldots xy$
where $y$ is a prefix of $x$ and $|x^r| = r|x|$. (We assume that $r > 1$ is a
fraction with denominator $|x|$.)
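As a small illustration of this definition, the following Python sketch builds $x^r$ for a string $x$ and a rational exponent $r$. The function name `fractional_power` is ours and does not come from the paper; the sketch assumes that $x$ is non-empty and that $r|x|$ is an integer.

```python
from fractions import Fraction


def fractional_power(x: str, r: Fraction) -> str:
    """Return x^r: the prefix of xxx... that has length r * |x|."""
    if not x:
        raise ValueError("x must be non-empty")
    length = r * len(x)
    if length.denominator != 1:
        raise ValueError("r * |x| must be an integer")
    n = int(length)
    # Repeat x often enough to cover n letters, then cut the prefix.
    repetitions = n // len(x) + 1
    return (x * repetitions)[:n]
```

For example, `fractional_power("abc", Fraction(5, 3))` returns `"abcab"`.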

One may ask whether there exists an infinite sequence of letters that does
not contain fractional powers $x^r$ with large $r$ and long $x$. More precisely,
for a given alphabet size $a$, a given integer $l$ and a given real $\alpha$ one may
ask whether there exists an infinite sequence of letters that does not contain
fractional powers $x^r$ with $r \ge \alpha$ and $|x| \ge l$.

For $\alpha = 1$ the answer is evidently negative (each string $x$ is a fractional
power $x^1$). On the other hand, it is easy to see that for any $a \ge 2$ and $l \ge 1$
the answer is positive if $\alpha$ is large enough (there exists a binary sequence that
does not contain factors $x^3$). The threshold value that separates negative
and positive answers is denoted by $R(a,l)$ in [7]; the authors note that
$1 < R(a,l) \le 2$ and compute exact values of $R(a,l)$ for some pairs $(a,l)$.
Evidently, $R(a,l)$ decreases when $a$ or $l$ increases.

To get a lower bound for $R(a,l)$, let us apply the pigeonhole principle to
the $a + 1$ letters at positions $0, l, 2l, \ldots, al$. Two of them must be equal, and
this creates a fractional power $x^r$ where $|x| \ge l$ and $r \ge 1 + \frac{1}{la}$ (this power
starts and ends with a letter that appears twice, so $r = 1 + \frac{1}{|x|}$ with
$|x| \le la$). Therefore,

$$R(a,l) \ge 1 + \frac{1}{la}.$$
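The pigeonhole argument can be carried out mechanically on any finite prefix of length at least $al + 1$. The sketch below is only an illustration under our own naming (`pigeonhole_power` is hypothetical); it assumes the input uses at most $a$ distinct letters.

```python
from fractions import Fraction


def pigeonhole_power(seq, a, l):
    """Find a factor with period p, l <= p <= a*l, and exponent 1 + 1/p.

    Looks only at the a + 1 letters at positions 0, l, 2l, ..., a*l.
    Assumes seq has length >= a*l + 1 and at most a distinct letters.
    """
    first_seen = {}
    for k in range(a + 1):
        pos = k * l
        letter = seq[pos]
        if letter in first_seen:
            start = first_seen[letter]
            period = pos - start
            # The factor starts and ends with the repeated letter.
            factor = seq[start:pos + 1]
            return factor, period, Fraction(period + 1, period)
        first_seen[letter] = pos
    raise ValueError("the sampled positions use more than a letters")
```

For instance, on `"abcabcabc"` with $a = 3$ and $l = 2$ the sampled letters are `a, c, b, a`, so the function returns the factor `"abcabca"` with period 6 and exponent $7/6 = 1 + \frac{1}{la}$.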

Francesca Fiorenzi, Pascal Ochem and Elise Vaslet in [5] gave stronger 
lower bounds and also some upper bounds for R(a,l). In particular, they 
proved that 

1 . . 2lnl 



where A = ^4 and a constant in O may depend on a but not 

on /. 

In this paper we use the Lovász local lemma to prove a stronger upper bound
for $R(a,l)$. Our upper bound differs from the lower bound only by a constant:

$$R(a,l) \le 1 + \frac{c}{la}$$

for some $c$ and for all $a \ge 2$, $l \ge 1$.



2 Kolmogorov complexity of subsequences 

We present the proof using the notion of Kolmogorov complexity (also called 
algorithmic complexity or description complexity). We refer the reader to [1] 
or [10] for the definition and basic properties of Kolmogorov complexity.

For an infinite sequence $\omega$ and a finite set $X \subset \mathbb{N}$ let $\omega(X)$ be the string of
length $\#X$ formed by $\omega_i$ with $i \in X$ (in the same order as in $\omega$).

We use the following result from [9] that guarantees the existence of a
sequence $\omega$ such that the strings $\omega(X)$ have high Kolmogorov complexity for all
simple $X$:

Theorem 1. Let $\alpha$ be a positive real number less than 1. There exists
a binary sequence $\omega$ and an integer $N$ such that for any finite set $X$ of
cardinality at least $N$ the inequality

$$K(X, \omega(X) \mid t) \ge \alpha \#X$$

holds for some $t \in X$.

Here $K(X, \omega(X) \mid t)$ is the conditional Kolmogorov complexity of the pair
$(X, \omega(X))$ relative to $t$.

We need a slightly more general version of this result (for any alphabet
size):

Theorem 2. Let $a \ge 2$ be an integer. Let $\alpha$ be a positive real less than 1.
There exists a sequence $\omega$ in an $a$-letter alphabet and an integer $N$ such that
for any finite set $X$ of cardinality at least $N$ the inequality

$$K(X, \omega(X) \mid t) \ge \alpha \#X \log a$$

holds for some $t \in X$.

Proof. Theorem 2 can be proven using exactly the same argument as in [9]
(the Lovász local lemma technique). It can also be formally derived from
Theorem 1 as follows: we encode the $a$ letters of the alphabet by bit blocks of some
length $t$ (large enough). This encoding is not bijective (several blocks
encode the same letter) but is chosen in such a way that all letters have almost
the same number of encodings (about $2^t/a$). Then we take a sequence from
Theorem 1, split it into $t$-bit blocks and replace these blocks by the
corresponding letters. If some subsequence formed by the letters is simple, then the
corresponding bit subsequence is simple, too. (Technically we should change
$\alpha$ slightly to compensate for "boundary effects".) □
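One possible almost-balanced encoding can be sketched in Python. The proof does not fix a particular encoding; the rule "block value modulo $a$" below is our assumption, chosen because it gives every letter either $\lfloor 2^t/a \rfloor$ or $\lceil 2^t/a \rceil$ encodings. The function names are hypothetical.

```python
def encoding_counts(a, t):
    """Number of t-bit blocks that encode each of the a letters."""
    counts = [0] * a
    for value in range(2 ** t):
        counts[value % a] += 1  # assumed rule: letter = block value mod a
    return counts


def bits_to_letters(bits, a, t):
    """Split a bit list into t-bit blocks and replace each block by a letter.

    An incomplete trailing block is dropped.
    """
    letters = []
    for k in range(0, len(bits) - t + 1, t):
        value = 0
        for bit in bits[k:k + t]:
            value = 2 * value + bit
        letters.append(value % a)
    return letters
```

With $a = 3$ and $t = 4$ the counts are `[6, 5, 5]`, so all letters have about $2^t/a$ encodings, as the proof requires.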



3 Weak upper bound 

To illustrate the technique, we first prove a simple generalization of a result
obtained by Beck [6] and provide an upper bound for $R(a,l)$ that is weaker
than our final bound:

Theorem 3. For every $a \ge 2$ and every real number $b \in (1,a)$ there exists a
number $N$ and a sequence $\omega$ in an $a$-letter alphabet such that for every $n > N$
the distance between any two different occurrences of the same substring of
length $n$ in $\omega$ is at least $b^n$.

Proof. Construct a sequence $\omega$ using Theorem 2 with $\alpha$ close enough to 1.

Let $I$ and $J$ ($|I| = |J| = n$) be different intervals where the same
substring of length $n$ occurs in $\omega$. Let $X = I \cup J$. Then $n < \#X \le 2n$
(the intervals $I$ and $J$ are not necessarily disjoint) and the first $n$ letters of
$\omega(X)$ are equal to the last $n$ letters of $\omega(X)$. It is easy to see that the string
$\omega(X)$ is determined by its first $\#X - n$ letters, $n$ and $\#X$, so
$K(\omega(X)) \le (\#X - n)\log a + O(\log n)$.
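The claim that $\omega(X)$ is determined by its first $\#X - n$ letters can be made concrete: if the first $n$ letters of a string equal its last $n$ letters, the string has period $\#X - n$. A minimal Python sketch of the reconstruction (the name `reconstruct_from_prefix` is ours):

```python
def reconstruct_from_prefix(prefix, total_length):
    """Rebuild a string of the given length that has period len(prefix).

    If the first n letters of a string s equal its last n letters, then
    s[i] == s[i + (len(s) - n)], so s is the periodic extension of its
    first len(s) - n letters.
    """
    p = len(prefix)
    return "".join(prefix[i % p] for i in range(total_length))
```

For example, the string `"abcab"` has equal first and last 2 letters, and `reconstruct_from_prefix("abc", 5)` recovers it from its first $5 - 2 = 3$ letters.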

Assume $t \in X$. Then $X$ is determined by $t$, the number $n$, the distance
between $I$ and $J$ and the ordinal number of $t$ in $X$. So if the distance
between $I$ and $J$ is less than $b^n$ then
$K(\omega(X), X \mid t) \le (\#X - n)\log a + n\log b + O(\log n) < \alpha \#X \log a$
for large enough $n$ and $\alpha$ that is close enough
to 1 (because $\log b < \log a$). This contradicts the inequality of Theorem 2.
Therefore the sequence $\omega$ does not contain a pair of different occurrences of the
same substring of sufficiently large length $n$ with distance between them less
than $b^n$. □

In particular, for every integer $a \ge 2$, every real number $b \in (1,a)$ and
for large enough $l$ the following inequality holds:

$$R(a,l) \le 1 + \frac{\log_b l}{l}.$$
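The property in Theorem 3 can be checked by brute force on a finite prefix. The sketch below is an illustration only: it measures the smallest distance between two occurrences of the same length-$n$ substring, which Theorem 3 bounds from below by $b^n$ for the constructed sequence. The function name is hypothetical.

```python
def min_repeat_distance(seq, n):
    """Smallest distance between two occurrences of the same length-n factor.

    Returns None if no length-n factor occurs twice. It suffices to compare
    consecutive occurrences of each factor.
    """
    last_position = {}
    best = None
    for i in range(len(seq) - n + 1):
        factor = seq[i:i + n]
        if factor in last_position:
            distance = i - last_position[factor]
            if best is None or distance < best:
                best = distance
        last_position[factor] = i
    return best
```

A factor with period $p$ and exponent $r$ contains two occurrences of a substring of length $(r-1)p$ at distance $p$, which is how a lower bound on this distance limits the exponents of long fractional powers.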

4 The final upper bound 

In the weak upper bound we used the same sequence for all values of $l$.
Now we need different sequences for different values of $l$, but we want
the constant $c$ to be the same. To achieve this goal we use the following
"$l$-uniform" version of Theorem 1.

Theorem 4. Let $\alpha$ be a positive real number less than 1. There exists an
integer $N$ such that for every integer $l$ there exists a binary sequence $\omega$ that
has the following property: for every finite set $X$ of cardinality at least $N$
the inequality

$$K(X, \omega(X) \mid t, l) \ge \alpha \#X$$

holds for some $t \in X$.

Note that $\omega$ may depend on $l$ while $N$ is the same for all values of $l$. (If
we allowed $N$ to depend on $l$, this would be a standard relativization
of Theorem 1.)

Proof. Theorem 4 can be proven in the same way as Theorem 1. It can
also be formally derived from it: if a sequence $\tau$ and a number $N$ satisfy
the requirements of Theorem 1 and $z : \mathbb{N}^2 \to \mathbb{N}$ is a computable bijection,
then the sequence $i \mapsto \omega_i = \tau_{z(i,l)}$ and the same number $N$ satisfy the
requirements of Theorem 4 for the integer $l$. (The bijection adds an $O(1)$ term,
but this can be compensated by a small change in $\alpha$: the statement is true
for every $\alpha < 1$.) □
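The derivation only needs some computable bijection $z$. As an illustration we take the Cantor pairing function, which is our choice and is not prescribed by the proof; `tau` below stands for any function giving the terms of a sequence.

```python
def cantor_pair(i, l):
    """Cantor pairing: a computable bijection from N x N onto N."""
    s = i + l
    return s * (s + 1) // 2 + l


def l_uniform_sequence(tau, l):
    """Return omega with omega(i) = tau(z(i, l)), where z is cantor_pair.

    tau is assumed to be a function from indices to letters.
    """
    return lambda i: tau(cantor_pair(i, l))
```

Different values of $l$ select disjoint subsequences of `tau`, which is why one sequence from Theorem 1 yields a whole family of sequences with the same $N$.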

Now we can start proving the upper bound. 

Theorem 5. There exists a constant $c$ such that for any $a \ge 2$ and $l \ge 1$
the following inequality holds:

$$1 + \frac{1}{al} \le R(a,l) \le 1 + \frac{c}{al}.$$

Proof. The lower bound is easy (as shown in the introduction). Let us prove
the upper bound. Let us assume first that $a = 2$ (the general case can be
reduced to this special one).

Consider a sequence $\omega$ satisfying the requirements of Theorem 4 for
some $\alpha > \frac{1}{2}$. Then the required sequence without long fractional powers will
be constructed as

$$\tau_i = \omega_{f(i)}$$

for some mapping $f : \mathbb{N} \to \mathbb{N}$.

First let us define $f$ on the first $l$ integers (the value of the integer
constant $m$ will be chosen later):

1. $f(i) = i \bmod m$ for $i < l$ and $(i \bmod m) \ne m - 1$ (we say that these
indexes have rank 1).

2. $f(mi + m - 1) = (m - 1) + (i \bmod m)$ for $mi + m - 1 < l$ and $(i \bmod m) \ne m - 1$
(we say that these indexes have rank 2).

3. $f(m^2 i + m^2 - 1) = 2(m - 1) + (i \bmod m)$ for $m^2 i + m^2 - 1 < l$ and
$(i \bmod m) \ne m - 1$ (we say that these indexes have rank 3).

(And so on until $f$ is defined on all of the first $l$ integers.)

Then we define $f$ on the other blocks of $l$ integers in the same way but
using fresh bits each time. So if $f(\{0, 1, \ldots, l - 1\}) = \{0, 1, \ldots, L - 1\}$ then
$f(i + jl) = f(i) + jL$.
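The rules above can be written as one formula: an index $j < l$ has rank $r$ when $j + 1 = m^{r-1}(q + 1)$ with $m$ not dividing $q + 1$, and then $f(j) = (r-1)(m-1) + (q \bmod m)$. The Python sketch below implements this reading of the definition; the helper names are ours.

```python
def rank_and_quotient(j, m):
    """Return (r, q) with j + 1 == m**(r - 1) * (q + 1) and m not dividing q + 1."""
    r, n = 1, j + 1
    while n % m == 0:
        n //= m
        r += 1
    return r, n - 1


def f_in_block(j, m):
    """Value of f on an index j inside the first block (0 <= j < l)."""
    r, q = rank_and_quotient(j, m)
    return (r - 1) * (m - 1) + (q % m)


def make_f(l, m):
    """Return (f, L): f extended to all blocks, L = number of values per block."""
    L = max(f_in_block(j, m) for j in range(l)) + 1

    def f(i):
        block, j = divmod(i, l)
        return f_in_block(j, m) + block * L  # fresh bits for every block

    return f, L
```

For $m = 3$ and $l = 9$ the first block is mapped to `[0, 1, 2, 0, 1, 3, 0, 1, 4]`, so $L = 5$: bits of low rank are reused at short distances, while bits of higher rank are reused only at larger distances.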

Suppose the sequence $\tau_i = \omega_{f(i)}$ contains some fractional power $xyx$ with
$|xy| \ge l$ and the exponent

$$\frac{|xyx|}{|xy|} \ge 1 + \frac{c}{2l}.$$

Without loss of generality we can assume that the exponent $1 + \frac{c}{2l}$ is not
greater than 2 (otherwise the statement of the theorem follows from the existence
of a binary sequence, called the Thue-Morse sequence, that does not contain any
fractional power with exponent greater than 2, see [2], [3]). Also we can assume
that $c \ge 2m$ (increasing $c$, we make our task easier). So $l \ge \frac{c}{2} \ge m$ and
$|x| \ge \frac{c}{2l}|xy| \ge m$.
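The property of the Thue-Morse sequence used here can be checked on a finite prefix. The following Python sketch is a brute-force illustration under our own naming; it computes the largest exponent of any factor of a finite string.

```python
from fractions import Fraction


def thue_morse(n):
    """First n terms: the parity of the number of ones in the binary expansion of i."""
    return [bin(i).count("1") % 2 for i in range(n)]


def max_exponent(seq, min_period=1):
    """Largest exponent of a factor of seq whose period is at least min_period."""
    best = Fraction(1)
    n = len(seq)
    for p in range(min_period, n):
        run = 0  # number of consecutive positions j with seq[j] == seq[j + p]
        for j in range(n - p):
            if seq[j] == seq[j + p]:
                run += 1
                # A run of this length gives a factor of length p + run with period p.
                best = max(best, Fraction(p + run, p))
            else:
                run = 0
    return best
```

On the first 64 terms, `max_exponent(thue_morse(64))` equals 2: squares occur, but no factor has exponent greater than 2.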

First we consider the case when both occurrences of $x$ in $xyx$ lie entirely
in some blocks of size $l$ (in two different blocks, because $|xy| \ge l$). Denote by
$n$ the number of $l$-sized blocks between these two occurrences of $x$ and denote
by $k$ the integer that satisfies the inequality $m^{k-1} \le |x| < m^k$. Then
$m^k > \frac{c}{2}n$ and $k \ge 2$ (because $|x| \ge \frac{c}{2l}|xy| \ge m$).

Let us denote by $I$ and $J$ the sets of values of $f$ for the first and second
occurrences of $x$ (respectively) whose rank is not greater than $k$ (obviously
there is at most 1 index in each of these occurrences of $x$ whose rank is
greater than $k$). The sets $I$ and $J$ are disjoint because these occurrences
of $x$ lie in different $l$-sized blocks. Let $Z = I \cup J$; then for some
$t \in Z$ we have $K(Z, \omega(Z) \mid t, l) \ge \alpha \#Z$ by the statement of Theorem 4 (we
need here that $m \ge N + 1$, since $\#Z$ should be at least $N$).

Obviously,

$$\frac{1}{2}\#Z = \#I + O(1) = \#J + O(1) = (k-1)(m-1) + \frac{|x|}{m^{k-1}} + O(1).$$

The set $Z$ is determined by $t$, $l$, $m$, $n$, $k$, $|x|$ and the start/end positions of
the two occurrences of the word $x$ modulo $m^k$ (and one bit saying whether
$t$ belongs to the first occurrence of $x$ or to the second one). So
$K(Z \mid t, l) \le \log n + O(\log(m^k)) = O(k \log m)$ (since $m^k > \frac{c}{2}n$). We can also calculate
$\omega(Z)$ if $\omega(I)$ is given (we need at most one extra bit for calculating the entire
string $x$). Therefore

$$O(k \log m) + \frac{1}{2}\#Z \ge \alpha \#Z,$$

but $\alpha > \frac{1}{2}$ and $\#Z \ge 2(k-1)(m-1) + O(1) \ge k(m-1) + O(1)$. So
$k(m-1) \le O(k \log m)$, which is a contradiction if $m$ is large enough. (Recall
that the choice of $m$ was postponed.)

Consider now the general case for the position of the two occurrences
of $x$. If the length of $x$ is not large, i.e. $|x| < l$, we can reduce this case to
the previous one by splitting $x$ into parts and choosing the largest part (we
must multiply the constant $c$ by 3). Now let $x$ be longer than the block size
($|x| \ge l$). We can assume that there is no $l$-sized block that intersects both
occurrences of $x$ (in the other case we also split the word $x$ into parts).

Let us denote by $I$ and $J$ the sets of values of $f$ in the first and second
occurrences of $x$ respectively. The sets $I$ and $J$ are disjoint. Let
$Z = I \cup J$. Then for some $t \in Z$ we have $K(Z, \omega(Z) \mid t, l) \ge \alpha \#Z$.

The set $Z$ is determined by $t$, $l$, $m$ and the relative start/end positions of
the two occurrences of the word $x$ with respect to one of the preimages of
$t$ (for example, the first one). So $K(Z \mid t, l) \le \log|xy| + O(\log l) = O(\log|x|)$
(since $|x| \ge l$ and $|x| \ge \frac{c}{2l}|xy|$). To compute $\omega(Z)$, it is enough to know at
most a half of it ($\omega(I)$ or $\omega(J)$, whichever is smaller). Therefore

$$O(\log|x|) + \frac{1}{2}\#Z \ge \alpha \#Z,$$

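but $\alpha > \frac{1}{2}$ and
$\#Z = \Omega\left(\frac{|x|}{l}(m-1)\log_m l\right) = \Omega\left((\log|x|)\frac{m-1}{\log m}\right)$
(here we use that $|x| \ge l \ge m$ and $\frac{|x|}{\log|x|} \ge \frac{l}{\log l}$). That is a
contradiction if $m$ is large enough.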

This finishes the proof for a = 2. 

Assume now that $a \ge 6$ and $a$ is even. Let $\omega$ be the sequence constructed
for the binary alphabet and $l' = \frac{a-2}{2}l$. To get the required sequence $v$ we
color the terms of $\omega$ in $\frac{a}{2}$ colors: the $i$-th block of size $l$ gets color $i \bmod \frac{a}{2}$.
Then the size of the alphabet of the sequence $v$ (whose terms are now (bit, color)
pairs) equals $a$, and $v$ does not contain fractional powers $z^p$ with $|z| \ge l'$
and $p \ge 1 + \frac{c}{(a-2)l}$. And obviously $v$ does not contain any fractional powers
$z^p$ with $l \le |z| < l'$ (because it does not contain pairs of equal letters at
these distances).
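The coloring step is easy to sketch in Python. The code below is an illustration only: `bits` is an arbitrary placeholder for the binary sequence, and the function names are ours. It pairs every bit with the color of its $l$-sized block and checks that no two equal letters occur at a distance between $l$ and $(\frac{a}{2} - 1)l$.

```python
def color_blocks(bits, l, a):
    """Pair each bit with the color of its l-sized block (a even, a/2 colors)."""
    colors = a // 2
    return [(bit, (i // l) % colors) for i, bit in enumerate(bits)]


def has_equal_letters(seq, d_min, d_max):
    """True if seq[i] == seq[i + d] for some i and some d in [d_min, d_max]."""
    return any(seq[i] == seq[i + d]
               for d in range(d_min, d_max + 1)
               for i in range(len(seq) - d))
```

Two positions at such a distance lie in different blocks whose indices differ by at least 1 and at most $\frac{a}{2} - 1$, so their colors differ regardless of the bits.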

Therefore $R(a,l) \le 1 + \frac{c}{(a-2)l}$ if $a \ge 6$ and $a$ is even, and $R(2,l) \le 1 + \frac{c}{2l}$.
To prove the theorem for arbitrary $a$ it remains to note that $R(a,l)$
is decreasing in $a$, so $R(a,l) \le 1 + \frac{c'}{al}$ for every $a \ge 2$, $l \ge 1$, where $c'$ is a
suitable constant multiple of $c$. □